{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "tghWegsjhpkt"
      },
      "source": [
        "##### Copyright &copy; 2020 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "cellView": "form",
        "colab": {},
        "colab_type": "code",
        "id": "rSGJWC5biBiG"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "YuSYVbwEYNHw"
      },
      "source": [
        "# TensorFlow Model Analysis\n",
        "***An Example of a Key Component of TensorFlow Extended (TFX)***"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "rLsMb4vqY244"
      },
      "source": [
        "Note: You can run this example right now in a Jupyter-style notebook, no setup required!  Just click \"Run in Google Colab\"\n",
        "\n",
        "\u003cdiv class=\"devsite-table-wrapper\"\u003e\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
        "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://www.tensorflow.org/tfx/tutorials/model_analysis/tfma_basic\"\u003e\n",
        "\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\u003c/td\u003e\n",
        "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://colab.sandbox.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/model_analysis/tfma_basic.ipynb\"\u003e\n",
        "\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\"\u003eRun in Google Colab\u003c/a\u003e\u003c/td\u003e\n",
        "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://github.com/tensorflow/tfx/blob/master/docs/tutorials/model_analysis/tfma_basic.ipynb\"\u003e\n",
        "\u003cimg width=32px src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\"\u003eView source on GitHub\u003c/a\u003e\u003c/td\u003e\n",
        "\u003c/table\u003e\u003c/div\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "mPt5BHTwy_0F"
      },
      "source": [
        "This example colab notebook illustrates how TensorFlow Model Analysis (TFMA) can be used to investigate and visualize the characteristics of a dataset and the performance of a model.  We'll use a model that we trained previously, and now you get to play with the results!\n",
        "\n",
        "The model we trained was for the [Chicago Taxi Example](https://github.com/tensorflow/model-analysis/tree/master/examples/chicago_taxi), which uses the [Taxi Trips dataset](https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew) released by the City of Chicago.\n",
        "\n",
        "Note: This site provides applications using data that has been modified for use from its original source, www.cityofchicago.org, the official website of the City of Chicago. The City of Chicago makes no claims as to the content, accuracy, timeliness, or completeness of any of the data provided at this site. The data provided at this site is subject to change at any time. It is understood that the data provided at this site is being used at one’s own risk.\n",
        "\n",
        "[Read more](https://cloud.google.com/bigquery/public-data/chicago-taxi) about the dataset in [Google BigQuery](https://cloud.google.com/bigquery/). Explore the full dataset in the [BigQuery UI](https://bigquery.cloud.google.com/dataset/bigquery-public-data:chicago_taxi_trips).\n",
        "\n",
        "Key Point: As a modeler and developer, think about how this data is used and the potential benefits and harm a model's predictions can cause. A model like this could reinforce societal biases and disparities. Is a feature relevant to the problem you want to solve or will it introduce bias? For more information, read about \u003ca target='_blank' href='https://developers.google.com/machine-learning/fairness-overview/'\u003eML fairness\u003c/a\u003e.\n",
        "\n",
        "Key Point: In order to understand `TFMA` and how it works with Apache Beam, you'll need to know a little bit about Apache Beam itself.  The \u003ca target='_blank' href='https://beam.apache.org/documentation/programming-guide/'\u003eBeam Programming Guide\u003c/a\u003e is a great place to start."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Fnm6Mj3vTGLm"
      },
      "source": [
        "The columns in the dataset are:\n",
        "\u003ctable\u003e\n",
        "\u003ctr\u003e\u003ctd\u003epickup_community_area\u003c/td\u003e\u003ctd\u003efare\u003c/td\u003e\u003ctd\u003etrip_start_month\u003c/td\u003e\u003c/tr\u003e\n",
        "\n",
        "\u003ctr\u003e\u003ctd\u003etrip_start_hour\u003c/td\u003e\u003ctd\u003etrip_start_day\u003c/td\u003e\u003ctd\u003etrip_start_timestamp\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003epickup_latitude\u003c/td\u003e\u003ctd\u003epickup_longitude\u003c/td\u003e\u003ctd\u003edropoff_latitude\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003edropoff_longitude\u003c/td\u003e\u003ctd\u003etrip_miles\u003c/td\u003e\u003ctd\u003epickup_census_tract\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003edropoff_census_tract\u003c/td\u003e\u003ctd\u003epayment_type\u003c/td\u003e\u003ctd\u003ecompany\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003ctr\u003e\u003ctd\u003etrip_seconds\u003c/td\u003e\u003ctd\u003edropoff_community_area\u003c/td\u003e\u003ctd\u003etips\u003c/td\u003e\u003c/tr\u003e\n",
        "\u003c/table\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "q7-ouHFnWAsu"
      },
      "source": [
        "## Install Jupyter Extensions\n",
        "Note: If running in a local Jupyter notebook, then these Jupyter extensions must be installed in the environment before running Jupyter.\n",
        "\n",
        "```bash\n",
        "jupyter nbextension enable --py widgetsnbextension\n",
        "jupyter nbextension install --py --symlink tensorflow_model_analysis\n",
        "jupyter nbextension enable --py tensorflow_model_analysis\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "LZj-impiAD_l"
      },
      "source": [
        "## Install TensorFlow Model Analysis (TFMA)\n",
        "\n",
        "This will pull in all the dependencies, and will take a minute.  Please ignore the warnings."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "0YTVfGPTe69L"
      },
      "outputs": [],
      "source": [
        "import sys\n",
        "\n",
        "# Confirm that we're using Python 3\n",
        "assert sys.version_info.major is 3, 'Oops, not running Python 3. Use Runtime \u003e Change runtime type'"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "SA2E343NAMRF"
      },
      "outputs": [],
      "source": [
        "import tensorflow as tf\n",
        "print('TF version: {}'.format(tf.__version__))\n",
        "\n",
        "print('Installing Apache Beam')\n",
        "!pip install -Uq apache_beam==2.17.0\n",
        "import apache_beam as beam\n",
        "print('Beam version: {}'.format(beam.__version__))\n",
        "\n",
        "# Install TFMA\n",
        "# This will pull in all the dependencies, and will take a minute\n",
        "# Please ignore the warnings\n",
        "!pip install -q tensorflow-model-analysis==0.21.3\n",
        "\n",
        "import tensorflow as tf\n",
        "import tensorflow_model_analysis as tfma\n",
        "print('TFMA version: {}'.format(tfma.version.VERSION_STRING))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "RptgLn2RYuK3"
      },
      "source": [
        "## Load The Files\n",
        "We'll download a tar file that has everything we need.  That includes:\n",
        "\n",
        "* Training and evaluation datasets\n",
        "* Data schema\n",
        "* Training results as EvalSavedModels\n",
        "\n",
        "Note: We are downloading with HTTPS from a Google Cloud server."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "K4QXVIM7iglN"
      },
      "outputs": [],
      "source": [
        "# Download the tar file from GCP and extract it\n",
        "import io, os, tempfile\n",
        "BASE_DIR = tempfile.mkdtemp()\n",
        "TFMA_DIR = os.path.join(BASE_DIR, 'eval_saved_models-0.15.0')\n",
        "DATA_DIR = os.path.join(TFMA_DIR, 'data')\n",
        "OUTPUT_DIR = os.path.join(TFMA_DIR, 'output')\n",
        "SCHEMA = os.path.join(TFMA_DIR, 'schema.pbtxt')\n",
        "\n",
        "!wget https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/datasets/eval_saved_models-0.15.0.tar\n",
        "!tar xf eval_saved_models-0.15.0.tar\n",
        "!mv eval_saved_models-0.15.0 {BASE_DIR}\n",
        "!rm eval_saved_models-0.15.0.tar\n",
        "\n",
        "print(\"Here's what we downloaded:\")\n",
        "!ls -R {TFMA_DIR}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "_xa7ZDV1MycO"
      },
      "source": [
        "## Parse the Schema\n",
        "\n",
        "Among the things we downloaded was a schema for our data that was created by [TensorFlow Data Validation](https://www.tensorflow.org/tfx/data_validation/).  Let's parse that now so that we can use it with TFMA."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "uW5eB4TPcwFw"
      },
      "outputs": [],
      "source": [
        "from google.protobuf import text_format\n",
        "from tensorflow.python.lib.io import file_io\n",
        "from tensorflow_metadata.proto.v0 import schema_pb2\n",
        "from tensorflow.core.example import example_pb2\n",
        "\n",
        "schema = schema_pb2.Schema()\n",
        "contents = file_io.read_file_to_string(SCHEMA)\n",
        "schema = text_format.Parse(contents, schema)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "UP3yuJxfNXRL"
      },
      "source": [
        "## Use the Schema to Create TFRecords\n",
        "\n",
        "We need to give TFMA access to our dataset, so let's create a TFRecords file.  We can use our schema to create it, since it gives us the correct type for each feature."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "8-wud3fPczl6"
      },
      "outputs": [],
      "source": [
        "import csv\n",
        "\n",
        "datafile = os.path.join(DATA_DIR, 'eval', 'data.csv')\n",
        "reader = csv.DictReader(open(datafile, 'r'))\n",
        "examples = []\n",
        "for line in reader:\n",
        "  example = example_pb2.Example()\n",
        "  for feature in schema.feature:\n",
        "    key = feature.name\n",
        "    if len(line[key]) \u003e 0:\n",
        "      if feature.type == schema_pb2.FLOAT:\n",
        "        example.features.feature[key].float_list.value[:] = [float(line[key])]\n",
        "      elif feature.type == schema_pb2.INT:\n",
        "        example.features.feature[key].int64_list.value[:] = [int(line[key])]\n",
        "      elif feature.type == schema_pb2.BYTES:\n",
        "        example.features.feature[key].bytes_list.value[:] = [line[key].encode('utf8')]\n",
        "    else:\n",
        "      if feature.type == schema_pb2.FLOAT:\n",
        "        example.features.feature[key].float_list.value[:] = []\n",
        "      elif feature.type == schema_pb2.INT:\n",
        "        example.features.feature[key].int64_list.value[:] = []\n",
        "      elif feature.type == schema_pb2.BYTES:\n",
        "        example.features.feature[key].bytes_list.value[:] = []\n",
        "  examples.append(example)\n",
        "\n",
        "TFRecord_file = os.path.join(BASE_DIR, 'train_data.rio')\n",
        "with tf.io.TFRecordWriter(TFRecord_file) as writer:\n",
        "  for example in examples:\n",
        "    writer.write(example.SerializeToString())\n",
        "  writer.flush()\n",
        "  writer.close()\n",
        "\n",
        "!ls {TFRecord_file}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Qm5luW1EN7g7"
      },
      "source": [
        "## Run TFMA and Render Metrics\n",
        "\n",
        "Now we're ready to create a function that we'll use to run TFMA and render metrics.  It requires an [`EvalSavedModel`](https://www.tensorflow.org/api_docs/python/tf/saved_model), a list of [`SliceSpecs`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/SingleSliceSpec), and an index into the SliceSpec list.  It will create an EvalResult using [`tfma.run_model_analysis`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/run_model_analysis), and use it to create a `SlicingMetricsViewer` using [`tfma.view.render_slicing_metrics`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/view/render_slicing_metrics), which will render a visualization of our dataset using the slice we created."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "H2wANNF_2dCR"
      },
      "outputs": [],
      "source": [
        "def run_and_render(eval_model=None, slice_list=None, slice_idx=0):\n",
        "  \"\"\"Runs the model analysis and renders the slicing metrics\n",
        "\n",
        "  Args:\n",
        "      eval_model: An instance of tf.saved_model saved with evaluation data\n",
        "      slice_list: A list of tfma.slicer.SingleSliceSpec giving the slices\n",
        "      slice_idx: An integer index into slice_list specifying the slice to use\n",
        "\n",
        "  Returns:\n",
        "      A SlicingMetricsViewer object if in Jupyter notebook; None if in Colab.\n",
        "  \"\"\"\n",
        "  eval_result = tfma.run_model_analysis(eval_shared_model=eval_model,\n",
        "                                          data_location=TFRecord_file,\n",
        "                                          file_format='tfrecords',\n",
        "                                          slice_spec=slice_list,\n",
        "                                          output_path='sample_data',\n",
        "                                          extractors=None)\n",
        "  return tfma.view.render_slicing_metrics(eval_result, slicing_spec=slice_list[slice_idx])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "cSl9qyTCbBKR"
      },
      "source": [
        "### Slicing and Dicing\n",
        "\n",
        "We previously trained a model, and now we've loaded the results.  Let's take a look at our visualizations, starting with using TFMA to slice along particular features.  But first we need to read in the EvalSavedModel from one of our previous training runs.\n",
        "\n",
        "* To define the slice you want to visualize you create a [`tfma.slicer.SingleSliceSpec`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/SingleSliceSpec)\n",
        "\n",
        "* To use [`tfma.view.render_slicing_metrics`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/view/render_slicing_metrics) you can either use the name of the column (by setting `slicing_column`) or provide a [`tfma.slicer.SingleSliceSpec`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/SingleSliceSpec) (by setting `slicing_spec`)\n",
        "* If neither is provided, the overview will be displayed\n",
        "\n",
        "Plots are interactive:\n",
        "\n",
        "* Click and drag to pan\n",
        "* Scroll to zoom\n",
        "* Right click to reset the view\n",
        "\n",
        "Simply hover over the desired data point to see more details.  Select from four different types of plots using the selections at the bottom.\n",
        "\n",
        "For example, we'll be setting `slicing_column` to look at the `trip_start_hour` feature in our `SliceSpec`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "hJ5_UMnWYmaE"
      },
      "outputs": [],
      "source": [
        "# Load the TFMA results for the first training run\n",
        "# This will take a minute\n",
        "eval_model_base_dir_0 = os.path.join(TFMA_DIR, 'run_0', 'eval_model_dir')\n",
        "eval_model_dir_0 = os.path.join(eval_model_base_dir_0, next(os.walk(eval_model_base_dir_0))[1][0])\n",
        "eval_shared_model_0 = tfma.default_eval_shared_model(eval_saved_model_path=eval_model_dir_0)\n",
        "\n",
        "# Slice our data by the trip_start_hour feature\n",
        "slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_hour'])]\n",
        "\n",
        "run_and_render(eval_model=eval_shared_model_0, slice_list=slices, slice_idx=0)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "LJuxvGCpn4yF"
      },
      "source": [
        "### Slices Overview\n",
        "\n",
        "The default visualization is the **Slices Overview** when the number of slices is small. It shows the values of metrics for each slice. Since we've selected `trip_start_hour` above, it's showing us metrics like accuracy and AUC for each hour, which allows us to look for issues that are specific to some hours and not others.\n",
        "\n",
        "In the visualization above:\n",
        "\n",
        "* Try sorting the feature column, which is our `trip_start_hours` feature, by clicking on the column header\n",
        "* Try sorting by precision, and **notice that the precision for some of the hours with examples is 0, which may indicate a problem**\n",
        "\n",
        "The chart also allows us to select and display different metrics in our slices.\n",
        "\n",
        "* Try selecting different metrics from the \"Show\" menu\n",
        "* Try selecting recall in the \"Show\" menu, and **notice that the recall for some of the hours with examples is 0, which may indicate a problem**\n",
        "\n",
        "It is also possible to set a threshold to filter out slices with smaller numbers of examples, or \"weights\".  You can type a minimum number of examples, or use the slider."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "cQT-1Ckcnd_7"
      },
      "source": [
        "### Metrics Histogram\n",
        "\n",
        "This view also supports a **Metrics Histogram** as an alternative visualization, which is also the default view when the number of slices is large. The results will be divided into buckets and the number of slices / total weights / both can be visualized. Columns can be sorted by clicking on the column header.  Slices with small weights can be filtered out by setting the threshold. Further filtering can be applied by dragging the grey band. To reset the range, double click the band. Filtering can also be used to remove outliers in the visualization and the metrics tables. Click the gear icon to switch to a logarithmic scale instead of a linear scale.\n",
        "\n",
        "* Try selecting \"Metrics Histogram\" in the Visualization menu\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "hSnqI6Esb1XM"
      },
      "source": [
        "### More Slices\n",
        "\n",
        "Let's create a whole list of `SliceSpec`s, which will allow us to select any of the slices in the list.  We'll select the `trip_start_day` slice (days of the week) by setting the `slice_idx` to `1`.  **Try changing the `slice_idx` to `0` or `2` and running again to examine different slices.**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "355wqvY3yBod"
      },
      "outputs": [],
      "source": [
        "slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_hour']),\n",
        "          tfma.slicer.SingleSliceSpec(columns=['trip_start_day']),\n",
        "          tfma.slicer.SingleSliceSpec(columns=['trip_start_month'])]\n",
        "run_and_render(eval_model=eval_shared_model_0, slice_list=slices, slice_idx=1)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "PsXM0NYGeajg"
      },
      "source": [
        "You can create feature crosses to analyze combinations of features.  Let's create a `SliceSpec` to look at a cross of `trip_start_day` and `trip_start_hour`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "k7vbFS1Me1SH"
      },
      "outputs": [],
      "source": [
        "slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_day', 'trip_start_hour'])]\n",
        "run_and_render(eval_shared_model_0, slices, 0)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "GmeODqrUfJw2"
      },
      "source": [
        "Crossing the two columns creates a lot of combinations!  Let's narrow down our cross to only look at **trips that start at noon**.  Then let's select `accuracy` from the visualization:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "kdvBNfcHfRWg"
      },
      "outputs": [],
      "source": [
        "slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_day'], features=[('trip_start_hour', 12)])]\n",
        "run_and_render(eval_shared_model_0, slices, 0)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "meRvFkKcPbux"
      },
      "source": [
        "## Tracking Model Performance Over Time\n",
        "\n",
        "Your training dataset will be used for training your model, and will hopefully be representative of your test dataset and the data that will be sent to your model in production.  However, while the data in inference requests may remain the same as your training data, in many cases it will start to change enough so that the performance of your model will change.\n",
        "\n",
        "That means that you need to monitor and measure your model's performance on an ongoing basis, so that you can be aware of and react to changes.  Let's take a look at how TFMA can help."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Vm2y2DqNF4HL"
      },
      "source": [
        "### Measure Performance For New Data\n",
        "We downloaded the results of three different training runs above, so let's load them now and use TFMA to see how they compare using [`render_time_series`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/view/render_time_series).  We can specify particular slices to look at.  Let's compare our training runs for trips that started at noon.\n",
        "\n",
        "* Select a metric from the dropdown to add the time series graph for that metric\n",
        "* Close unwanted graphs\n",
        "* Hover over data points (the ends of line segments in the graph) to get more details"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "zJYUOjmFfuPy"
      },
      "outputs": [],
      "source": [
        "def get_eval_result(base_dir, output_dir, data_loc, slice_spec):\n",
        "  eval_model_dir = os.path.join(base_dir, next(os.walk(base_dir))[1][0])\n",
        "  eval_shared_model = tfma.default_eval_shared_model(eval_saved_model_path=eval_model_dir)\n",
        "\n",
        "  return tfma.run_model_analysis(eval_shared_model=eval_shared_model,\n",
        "                                          data_location=data_loc,\n",
        "                                          file_format='tfrecords',\n",
        "                                          slice_spec=slice_spec,\n",
        "                                          output_path=output_dir,\n",
        "                                          extractors=None)\n",
        "\n",
        "slices = [tfma.slicer.SingleSliceSpec()]\n",
        "output_dir_0 = os.path.join(TFMA_DIR, 'output', 'run_0')\n",
        "result_ts0 = get_eval_result(os.path.join(TFMA_DIR, 'run_0', 'eval_model_dir'),\n",
        "                             output_dir_0, TFRecord_file, slices)\n",
        "output_dir_1 = os.path.join(TFMA_DIR, 'output', 'run_1')\n",
        "result_ts1 = get_eval_result(os.path.join(TFMA_DIR, 'run_1', 'eval_model_dir'),\n",
        "                             output_dir_1, TFRecord_file, slices)\n",
        "output_dir_2 = os.path.join(TFMA_DIR, 'output', 'run_2')\n",
        "result_ts2 = get_eval_result(os.path.join(TFMA_DIR, 'run_2', 'eval_model_dir'),\n",
        "                             output_dir_2, TFRecord_file, slices)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "RsO-gqCRK0ar"
      },
      "source": [
        "### How does it look today?\n",
        "First, we'll imagine that we've trained and deployed our model yesterday, and now we want to see how it's doing on the new data coming in today.  The visualization will start by displaying accuracy.  Add AUC and average loss by using the \"Add metric series\" menu.\n",
        "\n",
        "Note: In the metric series charts the X axis is the model ID number of the model run that you're examining.  The numbers themselves are not meaningful."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "KjEws8T0cDm9"
      },
      "outputs": [],
      "source": [
        "eval_results_from_disk = tfma.load_eval_results([output_dir_0, output_dir_1],\n",
        "                                                tfma.constants.MODEL_CENTRIC_MODE)\n",
        "\n",
        "tfma.view.render_time_series(eval_results_from_disk, slices[0])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "EQ7kZxESN9Bx"
      },
      "source": [
        "Now we'll imagine that another day has passed and we want to see how it's doing on the new data coming in today, compared to the previous two days. Again add AUC and average loss by using the \"Add metric series\" menu:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "VjQmlXMmLwHf"
      },
      "outputs": [],
      "source": [
        "eval_results_from_disk = tfma.load_eval_results([output_dir_0, output_dir_1, output_dir_2],\n",
        "                                                tfma.constants.MODEL_CENTRIC_MODE)\n",
        "\n",
        "tfma.view.render_time_series(eval_results_from_disk, slices[0])"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [
        "tghWegsjhpkt"
      ],
      "name": "tfma_basic.ipynb",
      "private_outputs": true,
      "provenance": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
